- Path: keats.ugrad.cs.ubc.ca!not-for-mail
- From: c2a192@ugrad.cs.ubc.ca (Kazimir Kylheku)
- Newsgroups: comp.lang.ada,comp.lang.c,comp.lang.c++,comp.edu
- Subject: Re: ANSI C and POSIX (was Re: C/C++ knocks the crap out of Ada)
- Date: 9 Apr 1996 07:38:36 -0700
- Organization: Computer Science, University of B.C., Vancouver, B.C., Canada
- Message-ID: <4kdspcINN6ct@keats.ugrad.cs.ubc.ca>
- References: <JSA.96Feb16135027@organon.com> <dewar.828987544@schonberg> <4kbuebINNrho@keats.ugrad.cs.ubc.ca> <dewar.829048603@schonberg>
- NNTP-Posting-Host: keats.ugrad.cs.ubc.ca
-
- In article <dewar.829048603@schonberg>, Robert Dewar <dewar@cs.nyu.edu> wrote:
- >"This is so deeply entrenched in the realm of common sense that it isn't even
- >worth mentioning in a standard document! Nevertheless, I have access to the
- >POSIX.1 standard and will look into this."
- >
- >This seems complete nonsense. There are two possible semantics that could
- >be defined for read (buffer must be at least size of the read argument,
- >or buffer must be at least size of data read). Both are easy to specify,
- >both are easy to implement. You cannot rely on common sense (especially
- >dubious reasoning about kernels and what not that are totally irrelevant
- >to the semantic specification). The idea that specs are derived from
-
- You are right. This has more to do with those unwritten rules that you
- mentioned earlier (my wording, not yours).
-
- It is just not reasonable to expect that you only have to supply a buffer
- large enough to hold the data actually read while telling the read function
- that the buffer is bigger.
-
- Suppose you don't know whether you may lie in specifying the buffer size,
- since no documentation explicitly allows or prohibits it. Which way do you
- make the decision? Which method is safer? Giving a buffer that is as large as
- you promise it is, or giving a smaller buffer?
-
- Even if you know with 100% certainty how many bytes will be read, there is no
- guarantee that the rest of the buffer will not be accessed.
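
One way to stay on the prudent side is to clamp the requested count to the real size of the buffer before calling read(). The following is a minimal sketch, assuming a POSIX system; the helper name read_into and its parameters are hypothetical and are not part of any standard.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not a POSIX function): read up to `want` bytes from
   fd, but never pass read() a count larger than the buffer's true size. */
ssize_t read_into(int fd, void *buf, size_t buf_size, size_t want)
{
    assert(buf != NULL);

    /* Clamp the request so the count matches what the buffer can hold. */
    size_t count = want < buf_size ? want : buf_size;

    return read(fd, buf, count);
}
```

With this shape, a caller who asks for more than the buffer holds simply gets a short read, and the question of what read() does with an overstated count never arises.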
-
- >implementations (either by looking at the implementation, or reasoning
- >about it with "common sense" or otherwise) is completely unacceptable!
-
- You are the one who advocates empirical approaches: in a recent posting you
- said that if something works on all the platforms, it is portable regardless
- of whether it invokes undefined behavior.
-
- >(though unfortunately very common, especially when people are writing in
- >a language that does not make a big deal about separating spec and
- >implementation details).
- >
- >My only at-hand sources are K&R, which has nothing whatever to say on
- >the subject, the Zortech C++ reference, which also has nothing to say,
- >(both describe read, but say nothing about the buffer length), and
- >the Microsoft Runtime Reference which talks about "attempting to
- >read n bytes", but is otherwise unclear.
- >
- >We are after all dealing with a language interface where in practice the
- >proper check (whichever it is) cannot be made, because the called routine
- >does not know the length of the buffer passed. I think a natural default
- >assumption, in the absence of any statement to the contrary, is that the
- >bytes are blindly read into the buffer, and disaster strikes if the number
- >of bytes read is greater than the buffer length, but otherwise all is well.
- >Unless there is a VERY clear statement of semantics to the contrary, I
- >don't see how anyone can call code that makes this assumption obviously
- >broken.
-
- You are right about that, of course. You can't call the code ``obviously
- broken'', but I would call the programmer imprudent.
-
- >This is of course a rather trivial detail but is instructive with regard
- >to the importance of writing precise specs. Kazimir's claim that the spec
- >obviously requires that the buffer length match the requested, rather
- >than actual length, based on some dubious reasoning about likely
- >implementation models is just the sort of thing that needs to be
-
- >eliminated from programming practices. Specs need to be made precise,
-
- But here there is a clear lack of precise specs! I'm advocating the _safer_,
- more _prudent_ assumption. There is clearly more opportunity to screw up if you
- falsely represent your buffer size to a system call or library function.
-
- Even if it were OK to do so on every system, the program may later change such
- that the hidden assumption is violated. Suddenly, not 68, but 113 bytes come
- from the file, for some reason, and the program fails or behaves strangely.
- Even all those UNIXes that check against the actual transfer size rather than
- buffer size will not necessarily catch this, since the check is usually only
- good to the granularity of a page.
-
- The current maintainer, of course, doesn't know what hack had been perpetrated
- and may be faced with tracking down problems that could have been avoided.
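
The 68-versus-113 scenario can be sketched as follows, assuming POSIX read(). The constant RECORD_SIZE and the function read_record are hypothetical names chosen for illustration. The buffer really is 68 bytes and the count passed to read() is also 68, so extra input cannot be written past the end of the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

enum { RECORD_SIZE = 68 };  /* figure borrowed from the post; illustrative only */

/* Hypothetical helper: rec must point to at least RECORD_SIZE bytes.
   The count given to read() equals the size the caller actually has. */
ssize_t read_record(int fd, char *rec)
{
    assert(rec != NULL);
    return read(fd, rec, RECORD_SIZE);
}
```

If the source later starts delivering 113 bytes, the first call stores at most 68 and the remaining 45 stay queued for the next call, which gives the maintainer a visible symptom instead of a silent overrun.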
-
- >so that a caller knows EXACTLY what the requirements are without having
- >to guess, or, worse still, examine the actual implementation code.
-
- I agree. I was dismayed when I was not able to find a definitive answer in the
- POSIX.1 standard itself. These sorts of things should be specified so that the
- programmers don't have to rationalize about what is likely to be safer.
-
- Here is my ``dubious'' reasoning laid out step by step, so criticize at will:
-
- 1. In the C language, the size of an object has a specific meaning. If I
- malloc 100 bytes, or declare 100 bytes in static or automatic storage,
- the size of that object is not 101, not 1000, but 100 bytes.
-
- (Granted, the third argument of read() is not usually referred to as
- the _size_ of the object pointed at by the second argument, but as a
- _count_ of bytes to be read into the buffer. It does, however, have
- type size_t, which ANSI C uses to hold the sizes of objects and which
- is the result type of the sizeof operator.)
-
- 2. No documentation has ever explicitly stated that the argument may be
- greater than the actual size of the object to which the pointer
- refers.
-
- 3. In choosing between two alternatives, choose the safer one, all else
- being equal.
-
- 4. Even if the apparently less safe alternative is actually safe, it
- depends on preconditions in the program which may change, namely
- assumptions about how many bytes are left in the particular file, pipe
- or whatever. This could cause problems in the maintenance cycle
- of the software.
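
The steps above suggest deriving the count from the object itself, so that the two cannot drift apart during maintenance. This is a minimal sketch assuming POSIX read(); the macro READ_ARRAY, the array buffer, and the function fill_buffer are hypothetical names, and the macro is only valid for true arrays, not for pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical convenience macro: the count is sizeof the array itself.
   Do not use it with a pointer, where sizeof would yield the pointer size. */
#define READ_ARRAY(fd, arr) read((fd), (arr), sizeof (arr))

static char buffer[100];  /* the size of this object is exactly 100 bytes */

/* Fill the static buffer from fd, requesting no more than it can hold. */
ssize_t fill_buffer(int fd)
{
    assert(sizeof buffer == 100);  /* sizeof yields a size_t, here 100 */
    return READ_ARRAY(fd, buffer);
}
```

If the array is later resized, the count follows automatically, and no assumption about how many bytes remain in the file or pipe is needed.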
- --
-
-